Swap-based Clustering for Location-based Services
Abstract
Clustering is an unsupervised learning method widely used in many fields, such as machine learning, pattern recognition, data mining and image analysis. The goal of this study is to investigate swap-based clustering and its application to location-based services. Swap-based clustering is a local search heuristic that tries to find the optimal centroid locations by performing a sequence of swaps between existing centroids and a set of candidate centroids. First, the thesis presents several swap-based clustering algorithms: random swap, deterministic swap, and hybrid swap, which combines random and deterministic swap. We then propose a simple and efficient swap-based clustering algorithm called smart swap. It finds the nearest pair among the centroids and sorts the clusters by their distortion values, and then moves one centroid of the nearest pair to any position in a cluster taken from the list sorted by distortion value. K-means iterations are then used to repartition the dataset and fine-tune the swapped solution. Experimental results of the swap-based clustering methods on both synthetic and real datasets are provided and analyzed. Finally, we study location-based services through one specific application, the MOPSI project. We apply clustering in the MOPSI applications to reduce clutter in map visualization at different scales, using a split smart swap clustering method to cluster the user locations and a grid-based clustering method with bounding boxes to cluster the photo collections. Experimental results in the studied web applications show that the split smart swap method runs in real time but is slow for large datasets, whereas the grid-based clustering method gives good clustering results and is significantly faster.
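The smart swap procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the thesis implementation: the abstract does not say which centroid of the nearest pair is moved, which position in the target cluster is chosen, or when the search stops. The sketch assumes a random centroid of the pair, a random data point of the target cluster, clusters tried in order of decreasing distortion, acceptance of the first swap that lowers the total distortion, and termination when no swap helps. All function names and parameters (`smart_swap`, `max_swaps`, `seed`, and the helpers) are hypothetical.

```python
import random
from math import dist


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points]


def cluster_distortions(points, centroids, labels):
    """Per-cluster sum of squared distances from points to their centroid."""
    d = [0.0] * len(centroids)
    for p, j in zip(points, labels):
        d[j] += dist(p, centroids[j]) ** 2
    return d


def kmeans(points, centroids, iterations=2):
    """A few k-means iterations: repartition, then move centroids to means."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        labels = assign(points, centroids)
        for j in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assign(points, centroids)


def nearest_pair(centroids):
    """Indices (a, b) of the two centroids that are closest to each other."""
    k = len(centroids)
    return min(((a, b) for a in range(k) for b in range(a + 1, k)),
               key=lambda ab: dist(centroids[ab[0]], centroids[ab[1]]))


def smart_swap(points, k, max_swaps=20, seed=0):
    """Sketch of smart swap for k >= 2; returns (centroids, labels, cost)."""
    rng = random.Random(seed)
    centroids, labels = kmeans(points, rng.sample(points, k))
    best = sum(cluster_distortions(points, centroids, labels))
    for _ in range(max_swaps):
        removed = rng.choice(nearest_pair(centroids))  # assumption: random pick
        per_cluster = cluster_distortions(points, centroids, labels)
        order = sorted(range(k), key=lambda j: per_cluster[j], reverse=True)
        improved = False
        for target in order:  # highest-distortion cluster first
            members = [p for p, l in zip(points, labels) if l == target]
            if target == removed or not members:
                continue
            trial = list(centroids)
            trial[removed] = rng.choice(members)  # assumption: random position
            trial, trial_labels = kmeans(points, trial)
            cost = sum(cluster_distortions(points, trial, trial_labels))
            if cost < best:  # keep the swap only if total distortion drops
                centroids, labels, best = trial, trial_labels, cost
                improved = True
                break
        if not improved:
            break
    return centroids, labels, best


if __name__ == "__main__":
    # Three small synthetic blobs, used only to exercise the sketch.
    data = [(x + dx, y + dy) for x, y in [(0, 0), (10, 0), (0, 10)]
            for dx in (0, 0.5, 1) for dy in (0, 0.5, 1)]
    centers, point_labels, total = smart_swap(data, k=3)
    print(centers, total)
```

Because a swap is kept only when it lowers the total distortion, the result in this sketch is never worse than the initial k-means solution it starts from.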
Similar resources
A Clustering Based Location-allocation Problem Considering Transportation Costs and Statistical Properties (RESEARCH NOTE)
Cluster analysis is a useful technique in multivariate statistical analysis. Different types of hierarchical cluster analysis and K-means have been used for data analysis in previous studies. However, the K-means algorithm can be improved using metaheuristic algorithms. In this study, we propose a simulated annealing based algorithm for K-means in the clustering analysis which we refer it a...
New spatial clustering-based models for optimal urban facility location considering geographical obstacles
The problems of facility location and the allocation of demand points to facilities are crucial research issues in spatial data analysis and urban planning. It is very important for an organization or government to locate its facilities well and manage its resources efficiently, so that all demand points are covered and all needs are met. Most of the recent studies, which f...
Application of Combined Local Object Based Features and Cluster Fusion for the Behaviors Recognition and Detection of Abnormal Behaviors
In this paper, we propose a novel framework for behavior recognition and the detection of certain types of abnormal behavior, capable of achieving high detection rates on a variety of real-life scenes. The proposed approach combines location-based methods with object-based ones. First, a novel approach is formulated to use optical flow and binary motion video as the loc...
Developing a ChatBot to Answer Spatial Queries for use in Location-based Services
A Chat Bot is an automated operator that can interact with customers like a human operator, answer their questions, solve problems and get feedback. Real-time responsiveness and the sense of talking to a human are among the features that make Chat Bots suitable for delivering location-based services. This paper designs a Chat Bot that can talk and answer users' questions based on their location. Thi...
Direct Marketing Based on Fuzzy Clustering of Customers (Case Study: on one Mobile Company)
Objective: There is a general tendency toward direct marketing these days. Therefore, instead of designing advertisement and marketing strategies for all the customers in the market, it is recommended to group the customers using clustering techniques and then design specific strategies accordingly. This will reduce marketing and advertisement expenses, increase sale department efficientl...